Virus Research, 21 (1991) 181-198 

© 1991 Elsevier Science Publishers B.V. All rights reserved 0168-1702/91/$03.50 




VIRUS 00713 


The cloning and sequencing of the virion protein 
genes from a British isolate of porcine respiratory 
coronavirus: comparison with transmissible 
gastroenteritis virus genes 


Paul Britton, Karen L. Mawditt and Kevin W. Page * 

Division of Molecular Biology A.F.R.C., Institute for Animal Health, Compton Laboratory, Compton, U.K. 
(Received 28 June 1991; revision received 9 August 1991; accepted 12 August 1991) 


Summary 


Previous analysis of porcine respiratory coronavirus (PRCV) mRNA species 
showed that mRNAs 2 and 3 were smaller than the corresponding transmissible 
gastroenteritis virus (TGEV) mRNA species (Page et al. (1991) J. Gen. Virol. 72, 
579-587). Sequence analysis showed that mRNA 3 was smaller due to the 
presence of a new putative RNA-leader binding site upstream of the PRCV 
ORF-3 gene. However, this observation did not explain the deletion observed in 
PRCV mRNA 2. Polymerase chain reaction (PCR) was used to generate cDNA 
from the 3’ coding region of the putative polymerase gene to the poly (A) tail of 
PRCV for comparison to the equivalent region from TGEV. The PRCV S protein 
was found to consist of 1225 amino acids, which had 98% similarity to the TGEV S 
protein. However, the PRCV S gene contained a 672 nucleotide deletion, corresponding
to 224 amino acids (residues 21 to 245 in TGEV S protein), 59 nucleotides
downstream of the S gene initiation codon. The PRCV genome from the
ORF-3 gene to the poly (A) tail was sequenced for comparison to TGEV in order 
to identify other potential differences between the two viruses. Four ORFs were 
identified that showed 98% similarity to the TGEV ORF-4, M, N and ORF-7 
genes. No other deletions or any PRCV specific sequences were identified. 

Introduction 

Porcine respiratory coronavirus (PRCV) appeared about 1984 and rapidly 
spread throughout the pig population in several, if not all, European countries 
where it now persists enzootically (Pensaert et al., 1986) and was recently isolated 
in the United States (Wesley et al., 1990a). The virus, initially isolated in Belgium 
(Pensaert et al., 1986) and in Britain (Brown and Cartwright, 1986), produced 
serological responses following infection of pigs that could not be distinguished 
from transmissible gastroenteritis virus (TGEV), by available diagnostic tests. The 
most striking difference between the two viruses was seen in their pathology. 
PRCV grows principally in the respiratory tract, producing mild or no clinical signs 
(Pensaert et al., 1986; Pensaert, 1989; O’Toole et al., 1989; Cox et al., 1990), 
although Van Nieuwstadt and Pol (1989) found that a Dutch isolate of PRCV 
intranasally inoculated into SPF pigs caused a fatal pneumonia. In contrast, 
although TGEV can grow in the respiratory tract, the virus preferentially grows in 
the enterocytes covering the tips of the villi in the small intestine, causing 
diarrhoea and dehydration resulting in high morbidity and mortality in neonatal 
pigs. 

PRCV has been classified as a coronavirus, a group of enveloped viruses with a 
positive-stranded RNA genome, belonging to the family Coronaviridae. TGEV 
(Britton et al., 1986; Jacobs et al., 1986) and PRCV (Britton et al., 1990) infected 
cells, in addition to the genomic RNA (mRNA 1), have six species of subgenomic 
mRNA (mRNAs 2-7) which form a 3' co-terminal “nested” set. The TGEV 
(Garwes and Pocock, 1975) and PRCV (Britton et al., 1990) virions contain three 
major structural polypeptides: a surface glycoprotein, spike (S), with a monomeric 
relative molecular mass (Mr) of 200,000; a glycosylated integral membrane protein 
(M), observed as a series of polypeptides of Mr 28,000-31,000; and a basic 
phosphorylated nucleoprotein (N) of Mr 45,000 associated with the viral genomic 
RNA. However, the PRCV S protein appears to have a slightly lower Mr than 
TGEV on polyacrylamide gels (Britton et al., 1990). 

PRCV was neutralised in vitro by antisera prepared against TGEV and the 
majority of monoclonal antibodies (MAbs) raised against any of the TGEV virion 
proteins cross-reacted with PRCV (Sanchez et al., 1990). However, some MAbs, 
raised against the S protein of either the virulent British strain FS772/70 (Garwes 
et al., 1988) or the avirulent Purdue strain of TGEV (Callebaut et al., 1988), did 
not recognise PRCV. Callebaut et al. (1988) showed that the MAbs which did not 
react with PRCV, produced from the Purdue strain of TGEV, were derived from 
three separate antigenic sites and mapped between amino acid residues 17 and 325 
(Correa et al., 1990). However, MAbs derived from one of these epitopes on the 
Purdue-115 strain did not react with TGEV strains FS772/70 and Miller due to a 
point mutation in the S protein sequences. 

The molecular characterisation of a British isolate of PRCV has been undertaken 
with the aim of identifying differences between the PRCV and TGEV 
genomes which may be linked to the different tropisms and pathogenicities of the 
two viruses. Initial work found that: (1) the putative leader RNA sequences 
present on the viral RNA species, postulated to be involved in transcription of the 
mRNA species, were identical for TGEV and PRCV (Page et al., 1990); (2) two of 
the PRCV mRNA species, 2 and 3, were smaller than the corresponding TGEV 
species (Britton et al., 1990; Page et al., 1991); (3) sequencing studies on the PRCV 
genome, corresponding to the 5'-end of TGEV mRNA 3, identified several small 
deletions (84 nucleotides) resulting in the loss of the potential TGEV ORF-3a 
gene (Page et al., 1991). However, the 84 nucleotides deleted did not account for 
the size difference, about 600 nucleotides, observed between TGEV and PRCV 
mRNA 2 nor did the deletions explain the differential binding of the MAbs. In this 
paper we present sequence data of the PRCV S gene to identify any variation to 
the TGEV S gene that would account for the size difference of mRNA 2 and the 
differential reaction of the MAbs and the sequence of the PRCV genome from the 
ORF-3 gene to the 3' poly (A) tail to identify any other variations in the PRCV 
genome. 


Materials and Methods 

Preparation of viral RNA 

Viral RNA was isolated from LLC-PK1 cells infected with PRCV strains 
86/137004 or 86/135308 as described by Britton et al. (1987) and Page et al. 
(1990). 

Preparation of oligonucleotide primers 

Oligonucleotides used for PCR amplifications were synthesised by the phosphoramidite 
method on an Applied Biosystems 381A DNA synthesizer. These were 
derived from published TGEV sequence data (Britton et al., 1988a; 1988b; 1989; 
Page et al., 1990; Britton and Page, 1990) and are listed in Table 1. Fig. 1 shows 
the position of the oligonucleotides on the TGEV viral genome. 
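
The relationship between the sense (+) and antisense (-) primers of Table 1 and the genomic sense strand can be illustrated with a minimal sketch. An antisense primer binds where its reverse complement occurs on the sense strand. The helper names are hypothetical, and the short excerpt is transcribed from the region around oligo 55 in Fig. 2, so it is an assumption that the transcription is free of errors.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def locate_primer(genome_sense: str, primer: str, sense: str) -> int:
    """Return the 0-based start of the primer site on the sense strand, or -1."""
    # A "+" primer matches the sense strand directly; a "-" primer matches
    # where its reverse complement occurs.
    target = primer.upper() if sense == "+" else reverse_complement(primer)
    return genome_sense.upper().find(target)


# Oligo 55 (antisense) from Table 1.
OLIGO_55 = "AGTAACACAACACTCTTA"
# Hypothetical excerpt: a short stretch of the PRCV S gene transcribed from Fig. 2.
EXCERPT = "GGTTCTGTCAATAAGAGTGTTGTGTTACTACCTAGC"

site = locate_primer(EXCERPT, OLIGO_55, "-")
print(reverse_complement(OLIGO_55), site)
```

The same function applied with sense "+" would look for the primer sequence itself, which is how the sense primers of each pair would be placed.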

Cloning of PCR generated fragments 

Synthesis of first-strand cDNA was carried out in 30 µl samples containing 5 µg 
total RNA (isolated from virus infected cells), 40 U RNasin (Promega), 50 mM 
Tris-HCl pH 8.3, 3 mM MgCl2, 75 mM KCl, 10 mM DTT, 2.5 mM dNTPs and 
primed with 160 ng of oligos 55, 14, 17, 41, 51, 25 or 85 using 23 U of avian 
myeloblastosis virus (AMV) reverse transcriptase (Super-RT; Anglian Biotech) at 42 °C 





Table 1 

Sequence of oligonucleotide primers used for PCR amplifications 


OLIGONUCLEOTIDE   SEQUENCE                                        SENSE

oligo 55          5'-AGTAACACAACACTCTTA-3'                        -
oligo 32          5'-TGTTGCCATTAAAATCA-3'                         +
oligo 14          5'-CGTGACGTTACCAGTGC-3'                         -
oligo 18          5'-AGATTGCTATTAGTAAG-3'                         +
oligo 17          5'-ACATACTAAGTCAGCTA-3'                         -
oligo 13          5'-CAGTGCTACACCTAGATTCATG-3'                    +
oligo 41          5'-TTTTCAATAGGTTCGTA-3'                         -
oligo 76          5'-AAACGTAAGTATCGTTCAG-3'                       +
oligo 51          5'-CTGTCCTTCCTAAATTGCAACACACCATGCATAGC-3'       -
oligo 52          5'-GGCCTTGGTATGTGTGGCTACTAATAGGC-3'             +
oligo 25          5'-GGTATGTTATTACTCTTC-3'                        -
oligo 60          5'-GTGTCGGCATCTTAATG-3'                         +
oligo 85          5'-TTTTTGTATATCACTATC-3'                        -
oligo 75          5'-CCTTTTAAAGTAAAGTGAGT-3'                      +


Note: all the above oligonucleotide sequences were derived from the sequence of TGEV strain 
FS772/70 except oligo 75 which corresponds to the 5'-end of the leader RNA sequence from TGEV 
FS772/70, PRCV 86/137004 and PRCV 86/135308. 



Fig. 1. Schematic diagram of the TGEV/PRCV genome showing the position of the oligonucleotides 
used for PCR cloning. The arrow heads show the position and the orientation of the oligonucleotides. 
The boxes show the positions of the TGEV/PRCV genes. The lines show the sizes of the PCR 
amplified fragments expected from the TGEV sequence and the dotted lines the sizes of the PRCV 
fragments if different from the equivalent TGEV fragments. The L denotes the position of the putative 
leader RNA sequences, upstream of the N and ORF-7 genes, on mRNA species 6 and 7 from which 
oligo 75 was used for PCR amplifications. 

for 90 min. The ssDNA was PCR amplified, using the oligonucleotide primers 
shown in Table 1, following a protocol supplied with the AmpliTaq™ kit (Perkin-Elmer-Cetus) 
in a Techne PHC-1 programmable thermal cycler using 35 cycles of 
94 °C for 1 min, 40 °C for 2 min and 72 °C for 3 min with a final elongation step of 
72 °C for 9 min. The PCR generated cDNA fragments were purified by agarose 
gel electrophoresis and isolated from the gel using Geneclean™. The cDNA was 
5'-phosphorylated using T4 polynucleotide kinase (Gibco, Bethesda Research 
Laboratories) and any incomplete ends repaired using Klenow fragment (Pharmacia), 
prior to ligation into SmaI-cut dephosphorylated pUC13 (Pharmacia). The 
resulting plasmids were transformed into E. coli strain JM105 and ampicillin 
resistant transformants directly analysed by a modification of the method of 
Güssow and Clarkson (1989). The transformants were grown in 2 ml of 2× LB 
containing 100 µg ml⁻¹ ampicillin of which 400 µl samples were centrifuged, the 
cell pellets resuspended in 500 µl H2O and boiled for 5 min. The cell debris was 
centrifuged for 5 min and 10 µl aliquots of the supernatants PCR amplified, using 
universal and reverse primers, for 30 cycles of 94 °C for 1 min, 55 °C for 2 min, 
72 °C for 2 min with a final elongation step at 72 °C for 9 min. The reaction 
products were analysed on 1% agarose gels. Plasmid DNA was isolated from 
transformants containing PRCV cDNA fragments of the expected length as 
outlined in Fig. 1. 
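
As a rough aid to reading the two cycling programmes described above, the following sketch adds up the programmed hold times. It is an illustration only: the function name is hypothetical, and the totals ignore the time the thermal cycler spends ramping between temperatures, which the text does not report.

```python
def program_minutes(cycles, step_minutes, final_extension_minutes):
    """Total programmed hold time in minutes (temperature ramping is ignored)."""
    return cycles * sum(step_minutes) + final_extension_minutes


# cDNA amplification: 35 cycles of 1 min (94 °C), 2 min (40 °C), 3 min (72 °C), then 9 min.
cdna_pcr = program_minutes(35, (1, 2, 3), 9)

# Colony screening: 30 cycles of 1 min (94 °C), 2 min (55 °C), 2 min (72 °C), then 9 min.
colony_pcr = program_minutes(30, (1, 2, 2), 9)

print(cdna_pcr, colony_pcr)
```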

Sequencing of cloned PCR fragments 

PRCV cDNA was cut from plasmids, using BamHI and EcoRI, and ligated 
into EcoRI and BamHI digested M13 mp18 and mp19 phage vectors. The PRCV 
cDNA was sequenced from the M13 ssDNA templates using the Sequenase™ 
(United States Biochemical Corporation) protocol and oligonucleotide primers 
derived from TGEV sequence data. Each cDNA fragment was sequenced several 
times in both directions to eliminate any ambiguous data. 

Data handling and analysis 

A sonic digitizer (Graf/Bar; Science Accessories Corporation) was used to read 
data into an Elonex PC-286 microcomputer and data were analysed on a MicroVAX 
3600 using the computer programs of Staden (1982), the University of Wisconsin 
Genetics Computer Group (UWGCG; Devereux et al., 1984) and CLUSTAL 
(Higgins and Sharp, 1988). 


Results 

Cloning of PRCV RNA 

The PRCV (86/137004) genome upstream of the S gene to the poly (A)-tail was 
cloned by PCR amplification. Eight cDNA fragments of 1298 bp (A), 1247 bp (B), 
950 bp (C), 1475 bp (D), 1367 bp (E), 1649 bp (F), 1774 bp (G) and 605 bp (H) 
were amplified from PRCV RNA using appropriate primers (Fig. 1). The sizes of 
fragments B, C, D, F, G and H were as expected from the TGEV sequence data; 
however, fragments A and E were 1298 bp and 1367 bp, in contrast to the TGEV 
1970 bp and 1451 bp fragments (Fig. 1). The size difference, 84 bp, in PRCV 
fragment E was due to the several small deletions reported by Page et al. (1991). 
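
The size comparison above amounts to a simple subtraction, sketched here using only the fragment sizes quoted in the text. The dictionary layout and function name are illustrative assumptions.

```python
# Fragment sizes in bp as quoted in the text: (PRCV observed, TGEV expected).
FRAGMENTS = {"A": (1298, 1970), "E": (1367, 1451)}


def size_difference(prcv_bp: int, tgev_bp: int) -> int:
    """Number of base pairs by which the PRCV fragment is shorter than the TGEV one."""
    return tgev_bp - prcv_bp


differences = {name: size_difference(*sizes) for name, sizes in FRAGMENTS.items()}

for name, diff in differences.items():
    # A difference divisible by three is compatible with an in-frame loss of codons.
    print(name, diff, "bp;", "multiple of 3" if diff % 3 == 0 else "not a multiple of 3")
```

The 84 bp difference for fragment E is the figure quoted in the text, and the difference for fragment A is consistent with mRNA 2 being about 600 nucleotides smaller in PRCV.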


End of Oligo 32 

MTCACGGAATTTAGTTGCAATAAATATTTATATGAATTGATTCAAAGATTTGAGTATTGGACTGTGTTTTGTACAAGTGTTAACACGTCATCATCAGAAGGCTTTCTGATTGGTGTTAA 120 
JTEFSWNKYLYELIQRFEYWTVFCTSVNTSSSEGFLIGVN 40 

d r 

CTACTTAGGACCATACTGTGACAAAGCAATAGTAGATGGGAATATAATGCATGCCAATTATATATTTTGGAGAAATTCTACAATCATGGCTCTATCACATAACTCAGTCCTAGACACTCC 240 

ylgpycdkaivdgnimhanyifwrnstimals'knsvldtp 80 


TAAATTCAAGTGTCGTTGTAACAACGCACTTATTGTTAATTTAAAAGAAAAAGAATTGAATGAAATGGTTATTGGATTACTAAGGAAGGGTAAGTTGCTCATTAGAAATAATGGTAAGTT 360 
KFKCRCWNALIVNLKEfCELNEMVIGLLRKGKLLIRNNGKL 120 

actaaactttggtaaccactttattaacacaccatgaaaaaattatttgtggtcttggttgtaatgccattgatttatggagacaagtttcctacuccgtagtttccaattgcactgat 480 

LNFGNHFINTP* t 131 

V <1) 

Spike MKKLFVVLVVMFLIYG DKFPTSVVSNCTD 29 

CAATGTGCTAGTTATGTGGCTAATGTTTTTACTACACAGCCAGGAGGCTTTATACCATCAGATTTTAGTTTTAATAATTGGTTCATCCTAACTAATAGCTCCACGTTGGTTAGTGGCAAA 600 

QCASYVANVFTTQPGGF1PSDFSFNNWFILTNSSTLVSGK 69 

TTAGTTACCAAACAGCCTCTATTAGTTAATTGCTTATGGCCAGTCCCTAGCTTTGAAGAAGCAGCTTCTACATTTTGTTTTGAAGGTGCTGACTTTGATCAATGTAATGGTGCTGTTTTA 720 

LVTKOPLLVNCLWPVPSFEEAASTFCFEGADFDQCNGAVl, 109 

AATAACACTGTAGACGTCATrAGGTTTAACCTTAATTTTACTACAAATGTACAATCAGGTAAGGGTGCTACAGTGTTTTCATTGAACACAACGGGTGGTGTCACTCTTGAAATCTCATGT 840 

NNTVDVIRFNLNFTTNVQSGKGATVFSLNTTGCVTLEISC 149 

TATAATGATACAGTGAGTGATTCGAGCTTTTCCAGTTACGGTGAAATTCCGTTCGGCGTAACTAATGGACCACGGTACTGTTACGTACTCTATAATCGCACAGCTCTTAAGTATCTAGGA 960 

YNDTVSDSSFSSYGEIPPGVTNGPRYCYVLYNGTALKYLG 189 

Oligo 18 
--- 

ACATTACCACCTAGTGTCAAGGAGATTGCTATTAGTAAGTGGGGCCAXTTTTATATTAATGGTTACAATTTCTTTAGCACATTTCCTATTGATTGTAXATCTTTTAATTTGACTACTGGT 1080 
TLPPSVKEIAISKWGHFYINGYHFFSTFPIOCISFNLTTG 229 
GATAGTGACGTCTTCTGGACAATAGCTTACACATCGTACACTGAAGCATTAGTACAAGTTGAAAACACAGCTATTACAAATGTGACGTATTGTAATAGTTATGTTAATAACATTAAATGC 1200 
DSDVFWTIAYTSYTEAlVQVENTAlTNVTYCNSYVNNlKC 269 

Oligo 55 

TCTCAACTTACTGCTAATrTGAATAATGGATTTTATCCTGTTTCTTCAAGTGAAGTTGGTTCTGTCAATAAGAGTGTTGTGTTACTACCTAGC'rTTCTGACACATACCAT'rGTTAACA'rA 1320 
SQLTANLNNGFYPVSSSEVGSVNKSVVLLPSFLTHTIVN I 309 
ACTATTGGTCTTGGTATGAAGCGTAGTGGTTATGGTCAACCCATAGCCTCAACGCTAAGTAACATTACACTACGAATGCAGGATAACAACACCGATGTGTACTGTGTTCGTTCTGACGAA 1440 
T1GLGMKRSGYG0PIASTLSNITLPMQDNNTDVYCVRSDQ 349 

TTTTCAGTTTATGTTCATTCTACTTGCAAAAGTGCTTTATGGGACAATGTTTTTAAGCGAAACTGCACGGACGTTTTAGATGCCACAGCTGTTATAAAAACTGGTACTTGTCCTTTCTCA 1560 
FSVYVHSTCKSAtWONVFKRNCTDVLDATAVIKTCTCPFS 389 
TTTGATAAATTGAACAATTACTTAACTTTTAACAAGTTCTGTTTGTCGTTGAGTCCTGTTGGTGCTAATTGTAAGTTTGATGTAGCTGCCCGTACAAGAACCAATGATCAGTTTGTTAGA 1680 
FDKLNNYLTFNKFCLSLSPVGANCKFDVAARTRTNDQFVR 429 

Oligo 13 

——- — - > 

AGTTTGTATCTAATATATGAACAAGGAGACAGCATAGTTGGTGTACCGTCTGACAAI’AGTGGTTTGCACGATTTGTCAGTGCTACACCTAGATTCGTGCACAGATTACAATATATATGGT 1800 
SLYVXYEEGDSlVGVPSDNSGLHDLSVItHLDSCTDYNIYG 469 

agaactggtgttggtattattagacaaactaacaggacgctacttagtggcttatattacacatcactatctggtgatttcttaggttttaaaaatgxtagtgatggtgttatctactct 1920 
RTGVGIIRQTNRTLLSGLYYTSLSGDLLGPKNVSDGVIYS 509 
GTAACGCCATGTGATGTTAGCGCACAAGCAGCTATTATTGATGGTGCCATAGTTGGGGCTATCACTTCCATTAACAGTGAATTGTTAGCTCTAACACATTGGACAATAACACCTAATTTT 2040 
VTPCDVSAQAAIIDGAIVGAITSINSELLALTHWTITPHF 54 9 

TATTACTACTCTATATATAATTACACAAATGATAAGACTCGTGGCACTCCAATTGGCAGTAATGACGTTGATTGTGAACCTGTCATAACCTATTCTAACATAGGTGTTTGTAAAAATGGT 2160 
yyYSIYNYTNDKTRGTPIGSNDVDCEPVITYSNJGVCKNG 589 

Oligo 14 

GCTTTGGTTTTTATTAACGTCACACATTCTGATGCAGACGTGCAACCAATTAGCACTGGTAACGTCACGATACCTACTAACTTTACTATATCCGTGCAAGTCGAATACATTCAGGTTTAC 2280 
ALVFINVTHSDGDVQPISTGMVTIPTNFTISVOVEYrQVY 629 

ACTACACCAGTGTCAATAGACTGTTCAAGATATGTTTGTAATGGCAACCCTAGGTGTAACAAACTGTTAACACAATACGTTICTGCATGTGAAACTATTGAGCAAGCACTTGCAATGGGT 2400 
TTPVSIDCSRYVCNGNPRCNKLLTQYVSACQT1EQALAMG 669 
GCCAGACTTGAAAACATGGAAGTTGATTCCATGTTATTTGTTTCTGAAAATGCCCTTAAATTGGCTTCTGTCGAAGCATTCAATAGTTCAGAAACTTTAGATCCTATTTACAAAGAATGG 

ARLENMEVDSMLFVSENALKLASVEAFHSSETLDPIYKEW 

Oligo 76 

-> 

CCrAATATAGGTGGCTTTTGGCTAGAAGGTCTAAAATACATACTTCCGTCCGATAATAGCAAACGTAAGTATCGTTCAGCTATAGAGGACTTGCTTTTTTCTAAGGTTGTAACATCTGGT 

PNIGGFWLEGLKYILPSDNSKRKYRSAIEDLLFSKVVTSG 

Oligo 17 

<-. -- 

TTAGGTACAGTTGATGAAGATTACAAACCTTGTACAGGTGGTTATGACATAGCTGACTrAGTATGTGCTCAATACTATAATGGCATCATGGTGCTACCTGGTGTGGCTAATGCTGACAAA 

LGTVDEDYKRCTGGYDIADLVCAQYYNGIMVLPGVANADK 


ATGACTATGTACACAGCATCCCTCGCAGGTGGTATAACATTAGGTGCACTTGGTGGAGGCGCCGTGGCTATACCTTTTGCAGTAGCAGTTCAGGCTAGACTTAArrATGTTGCTCTACAA 

MTMYTASLAGGITLGALGGGAVAIPFAVAVQARLNYVALQ 

ACTGATGTATTGAACAAAAACCAGCAGATCCTGGCTAGTGCTTTTAATCAAGCTATTGGTAACATTACACAGTCATTTGGTAAGGTTAATGATGCTATACATCAAACATCACGAGGTCTT 

TDVLNKNQQILASAFNQAIGNITQSFGKVNDAIHQTSRGL 

acaactgttgctaaagcattggcaaaagtgcaagatgttgtcaacacacaaggtcaagctttaagacacctaacagtacaattgcaaaataatttccaagccattagtagttctattagt 
ttvakalakvqdvvntqgqalrhltvqiqnnfqaisssis 
GAC ATTTATAAT AGGCTTGATGAATTGAGTGCTG ATGC ACAAGT CGAC AGGCTG ATCACAGGAAGACTTACAGCACTTAATGCATTTGTGTC TCAGACTCTAACCAG ACAAGCCGAGGTT 

DIYNRLDELSADAQVDRL1TGRLTALNAFVSQTLTRQAEV 

AGGGCTAGTAGACAACTTGCTAAAGACAACGTTAATGAATGCGTTAAGTCTCAGTCTCATAGATTCGGCTTCTGTGGTAATGGTACACATTTGTTTTCACTCGCAAATGCAGCACCAAAT 

rasrqlakdkvnecvksqshrfgfcgngthefslanaapn 

GGCATGATCTTCTTTCACACAGTGCTATTACCAACGGCTTATGAAACTGTGACTGCTTGGTCAGGTATTTGTGCTTTAGATGGTGATCGCACTTTTGGACTTGTCGTTAAAGATGTCCAG 

GMIFFHTVLLPTAYETVTAWSGICALDGDRTFGLVVKDVQ 

TTGACTTTATTTCGTAATCTAGATGACAATTTCTATTTGACACCCAGAACTATGTATCAGCCTAGAG1GGCAACTAGTTCTGATTTTGTTCAAATTGAAGGGTGCGATGTGCTGTTTGTT 

LTLFRNLDDNFYLTPRTMYQPRVATSSDFVQIEGCDVLFV 

AATACAACTGTAAGTGATTTGCCTAGTATTATACCTGATTATATTGATATTAATCAGACTGTTCAAGACATATTAGAAAATTTTAGACCAAATTGGACTGTACCIGAGCTGACTATGGAC 

? » » 

NTTVSDLPS I I PDY IDINQTVQD ILENFRPNWTVP ELTMD 

GTTTTTAACGCAACCTATTTAAACCTGACTGGTGAAATTGATGACTTAGAGTTTAGGTCAGAAAAGCTACATAACACTACTGTAGAACTTGCCATTCTCATTGACAACATTAACAATACA 
* * * * 
vfnatylnltgeiddlefrseklhnttvelailidhinnt 

Oligo 52 

--- > 

TTAGTCAATCTTGAATGGCTTAATAGAATTGAAACTTATGTAAAATGGCCTTGGTATGTGTGGCTACTAATAGGCTTAGTAGTAATATTTTGCATACCATTACTGCTATTTTGCTGTTGT 

LVNLEWLNRIETYVKWP wyvwll iglvv ifciplllfccc 

Oligo 41 

< - 

AGTACAGGTTGCTGTGGATGCATAGGTTGTTTAGGAAGTTGTTGTCACTCTATATTCAGTAGAAGACAATTTGAAAATTATGAACCTATTGAAAAAGTGCACGTCCATTAAATTTAAAAT 

( 11 ) 


CGC IGCLGSC 


IFSRRQFENYEP IEKVHVH 


GTTAATTTTA|CTGCTATAATATCATTTGTTGTTAAGGATGATGAATAAAGAACTTTCAAGTCAGTCAAATTTACTAATACATCCGTGGACGTTGTACTTGACGAACTTGATTGTGTATA 


V K F T 


L D C V Y 


V T L K 


I G F G D I 


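Nucleotide positions at the ends of the sequence rows following position 2400: 2520, 2640, 2760, 2880, 3000, 3120, 3240, 3360, 3480, 3600, 3720, 3840, 3960, 4080, 4200, 4320 
Amino acid positions at the ends of the same rows: 709, 749, 789, 829, 869, 909, 949, 989, 1029, 1069, 1109, 1149, 1189, 1225 

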

Fig. 2. The nucleotide and deduced amino acid sequences of the carboxyl-terminus of the 1b 
polymerase and S genes, including the ORF-3a pseudogene, from PRCV 86/137004. The horizontal 
arrows show the position and orientation of the primers used to generate the PCR fragments. The 
ACTAAAC sequence upstream of the S gene is identified with a thick line. Amino acids below the 
PRCV polymerase sequences are substitutions found for the FS772/70 strain of TGEV. The double 
underlined sequence at the beginning of the S gene is the predicted N-terminal signal sequence. 
Potential N-glycosylation sites (NXT or NXS) are identified with a black triangle. The double 
underlined sequence at the carboxyl-terminus of the S protein shows the position of the potential 
transmembrane domain. The numbered vertical arrows shown below the PRCV nucleotide sequence 
indicate the positions of the deletions found when compared to the TGEV FS772/70 sequence, 
corresponding to (i) 672, (ii) 9 (this deletion is also found in TGEV strains Purdue-115 and Miller), (iii) 
13, (iv) 22 and (v) 36 nucleotides (Page et al., 1991). The lettered vertical arrows below the PRCV 
sequence indicate insertions, corresponding to (A) 3, (B) 16 and (C) 29 nucleotides (Page et al., 1991), 
found in either TGEV Purdue-115, (B) and (C), or both Purdue-115 and Miller, (A), that are not found 
in either TGEV FS772/70 or PRCV. These sequence data will appear in the EMBL/GenBank/DDBJ 
Nucleotide Databases under the accession number X60089. 

ORF-4 MTFpRALTVIDDNGMVIS I IF 21 
FS772/70 L 

Purdue—115 n 


TTGGTTCCTGTTGATAATTATATTGATATTACTTTCAATAGCATTGCTAAATATAATTAAGCTATGCATGGTGTGTTGCAATTTAGGAAGGACAGTTATTATTGTTCCAGTGCAACATGC 240 
WFLLI I ILILLSI ALLNI IKLCMVCCNLGRTVI IVPVQHA 61 

F t 

A A 

ttacgatgcctataagaattttatgcgaattaaagcatacaaccctgatggagcactccttgtttga^^IIaacaaaatgaagattttgttgatattagcgtgtgcgattgcatgcacat 360 

YDAYKNFMRIKAYNPDGALLV* 82 

H Membrane MKILLILACAIACTC 15 

A FS772/70 l . ' ' V . ““"A 

Purdue-115 V A 

'CTGATTCTGACTl 

GERYCAMKDDTGLSCRNGTASDCESCFNRGDLIWHLAN W_N 55 

* SAG 

S D S G 

ACTTCAGCTGGTCTATAATATTGATCATTTTTATTACTGTGCTACAATATGGAAGACCTCAATTCAGCTGGTTCGTGTATGGCATTAAAATGCTTATAATGTGGCTATTATGGCCGATTG 600 

R P Q F S 95 

V A V 

TTTTGGCTCTTACGATTTTTAATGCATACTCGGAATACCAAGTGTCCAGGTATGTAATGTTCGGCTTTAGTATTGCAGGTGCAATTGTTACATTTCTACTCTCGATTATGTATTTTCTAA 720 

S E Y Q V S R R 135 

V 

GATCCATTCAGTTGTACAGAAGGACTAAGTCTTGGTGGTCCTTCAACCCTGAAACTAACGCAATTCTTTGCGTTAGTGCATTAGGAAGAAGCTATGTGCTTCCTCTCGAAGGTGTGCCAA 040 

SIQLYRRTKSWWSFNPETNA1LCVSALGRSYVLPLEGVPT 175 

I 

K 

CTGGTGTCACTCTAACTTTGCTTTCAGGGAATTTGTACGCTGAAGGGTTCAAAATTGCAGGTGGTATCACCATCGACAATTTGCCAAAATACGTAATGGTTGCATTACCCAGCAGGACTA 960 

GVTLTLLSGNLYAEGFKIAGGMTIDNLPKYVMVALPSRTI 215 

N 

N 

TTGTTTACACACTTGTTGGCAAGAAGTTGAAGGCAAGTAGTGCGACTGGATGGGCTTACTATGTAAAATCTAAAGCTGGTGATTACTCAACAGAGGCAAGAACTGATAATTTGAGTGAGC 1080 

VYTLVGKKLKASSATGWAYYVKSKAGDYSTEARTDNLSEQ 255 

D 

AAGAAAAATTATTACATATGGTATAACTAAACTTCTAAATGGCCAACCAGGGACAACGTGTTAGTTGGGGGGATGAATCCACCAAAATACGTGGTCGCTCCAATTCCCGTGGTCGGAAGA 1200 

EKLLKMV* 262 

Nucleoprotein MANQGQRVSWGDESTK I RGRSNSRGRK I 28 

FS772/70 S 

Purdue-115 T N 

Oligo 25 


GTTATTGGAATAGACAAACTCGCTATCGCATGGTGAAGGCCCAACGTAAAGAGCTTCCTGAAAGGTGGTTCTTTTACTACTTAGGCACTGGACCTCATGCAGATGCCAAATTTAAAGATA 1440 
YWNRQTRYRMVKGQRKELPERWFFYYLGTGPHADAKFKDK 108 


AATTTCAACTTGAAGTTAACCAGTCTAGGGACAACTCAAGGTCACGCTCTCAATCTAGATCGCGGTCTAGAAACAGATCTCAATCTAGAGGTAGGCAACAATCCAATAACAAGAAGGATG 1680 
FQLEVNQSRDNSRSRSQSRSRSRNRSQSRGRQQSNNKKDD 188 

F 

ACAGTGTAGAACAAGCTGTTCTTGCCGCACTTAAAAAGTTAGGTGTTTACACAGAAAAACAACAGCAACGCTCTCGTTCTAAATCTAAAGAGCGTAGTAACTCTAAAACAAGAGATACTA 1800 
SVEQAVLAALKKLGVYTEKQOQRSRSKSKERSNSKTRDTT 228 

D C 

D 


CGCCTAAGAATGAAAACAAACACACCTGGAAGAGAACTGC aggt AAAGGTGACGTGACAAG ATTTTATGGAGCTAGAAGCAGCTC agccaattttggtgac AGTGACCTCGTTGCC AATG 1920 
PKNENKHTWKRTAGKGDVTRFYGARSSSANFGDSDLVANG 268 

S T 

GGAGCAGTGCCAAGCATTACCCACAATTGGCTGAATGTGTTCCATCTGTGTCTAGCATTTTGTTTGGAAGCTATTGGACTTCAAAGGAAGATGGCGACCACATAGAAGTCACGTTCACAC 2040 
SSAKHYPQLAECVPSVSS ILFGSYWTSKEDGDQ I EVTFTH 308 


ACAAATACCACTTGCCAAAGGATCATCCTAAAACTGAACAATTCCTTCAGCAGATTAATGCCTATGCTTGCCCATCAGAAGTGGCAAAAGAACAGAGAAAAAGAAAGTCTCGTTCTAAAT 2160 
KYHLPKDHPKTEQFLQQINAYACPSEVAKEQRKRKSRSKS 348 
CTGCAGAAAGGTCAGAGCAAGAGGTGGTACCTGATTCATTAATAGAAAACTATACACATGTGTTTGATGACACACAGGTTGAGATGATTGATGAGGTAACGAACTAAACGAGATGCTCGT 2280 
AERSEQEVVPDSL I ENYTDVFDDTQVEMIDEVTN * 382 

A ORF-7 M L V 3 

D A I FS772/70 

Purdue-115 

CCTCCTCCAGGCTGTATGTATTACAGTTTTAACCTTACTACTAATTGGTAGACTCCAATTATTAGAAAGATTATTACTTAATCACTCTTTCAATCTTAAAACTGTTGATGATTTTAATAT 2400 

LLQAVCITVLTLLLIGRLQLLERLLLNHSFNLKTVDDFNI 43 

H F L A 

F H F I D N 

CTTATAT AGGAGTTTAGOAGAAACTAGATTACTAAAAGTGGTGCTTCGATTAATC TTTTT AGTCTT ATT AGGATTTTGCTCC T ATAG ATTGTT AG TC AT ATT AATGT AAGGCAACCCGAT 2520 
LYRSLAETRLLKVVLRLIFLVLLGFCCYRLEVILM* 78 

V TV 

GTATAAAACTGGTTTTTCCGAGGAACTACTGGTCATCGCGCTGTCTACTCTTGTACAGAATGGTAAGCACGTGTAATAGGAGGTACAAGCAACCCTATTGCATATTAGGAAGTTTAGATT 2640 
TGATTTGGCAATGCTACATTTAGTAATTTAGAGAAGTTTAAAGATCCPCTACGACGAGCCAACAATGGAAGAGCTAACGTCTGGATCTAGTGATTGTTTAAAATGTAAAATTGTTTGAAA 2760 

Oligo 85 

<—_--- 

ATTTTCCTTTTGATAGTGATATAC AAAAAA 2790 

Fig. 3. The nucleotide sequence of the PRCV (86/137004) genome from the 3'-end of ORF-3 gene to 
the poly (A) tail. The amino acid sequences corresponding to the carboxyl-terminus of ORF-3, ORF-4, 
M, N, and ORF-7 gene products are shown below the nucleotide sequences. The horizontal arrows 
show the position and orientation of the primers used to generate the PCR fragments. The position of 
oligo 75 is not shown as it corresponds to the 5'-end of the leader RNA sequence and was used to 
generate PCR fragments from the two smallest mRNA species, mRNAs 6 and 7. The position of oligo 
60 is not shown as it corresponded to a region within PRCV ORF-3, not shown in this figure, described 
by Page et al. (1991). Amino acid substitutions found on the TGEV FS772/70 (Britton et al., 
1988a;1988b) and Purdue-115 (Laude et al., 1987; Rasschaert et al., 1987) sequences are shown below 
the PRCV sequences. Positions of the putative RNA-leader binding sites CTAAAC for ORF-4 and 
ACTAAAC for M, N and ORF-7 genes are shown as thick lines above the nucleotide sequence. The 
predicted N-terminal membrane signal sequence of the M protein is double underlined. The thick lines 
below the M amino acid sequence show the positions of the predicted transmembrane domains. The 
number sign shows the position of the amino acid deleted from the ORF-4 gene in TGEV FS772/70. 
The black triangle shows the position of potential N-glycosylation sites. The nucleotide sequence data 
reported in this paper for PRCV strain 86/137004 have been submitted to the EMBL/GenBank/DDBJ 
nucleotide sequence databases and have been assigned the accession number X60056. 


The observation that the PRCV fragment A was about 600 bp smaller than the 
equivalent TGEV fragment would account for the observed difference in the size 
of PRCV mRNA 2 (Britton et al., 1990; Page et al., 1991). The PRCV cDNA 
fragments A to H were cloned into pUC13 and the corresponding plasmids 
pPR137-7 (A), pPR137-9 (B), pPR137-11 (C), pPR137-13 (D), pKP-1 (E), 
pPR137-5 (F), pPR137-1 (G) and pPR137-3 (H) were used for DNA sequencing. 

Sequencing of PRCV cDNA 

The PRCV cDNA from the above plasmids, except pKP-1, was subcloned into 
M13 vectors and ssDNA templates sequenced. The PRCV cDNA from pKP-1 was 
sequenced as described by Page et al. (1991). 

The sequence of 4320 bp, derived from the PRCV cDNA in fragments 
A-D and part of E, is shown in Fig. 2. The sequence was translated in all six 
reading frames. An ORF of 3678 bp, nucleotides 394-4071, corresponding to a 
gene product of 1225 amino acids and preceded by the potential RNA polymerase-leader 
complex binding site, ACTAAAC, 26 bp upstream of the initiation site (Fig. 
2), was identified as the PRCV S gene. Comparison of the PRCV and TGEV S 
genes identified a deletion of 672 nucleotides, corresponding to 224 amino acids 
(TGEV residues 21 to 245), 59 nucleotides downstream of the PRCV S gene 
initiation codon (Figs. 2 and 4). The rest of the deduced amino acid sequence of 
the PRCV S protein showed 98% similarity to the TGEV S protein. The 5'-end 
region of the PRCV S gene from a second British isolate of PRCV, strain 
86/135308, was cloned and sequenced and found to have an identical deletion. 
The first 20 amino acids of the PRCV S protein were identical to those found on 
TGEV of which the first 16 fulfil the criteria for a eukaryotic signal sequence with 
the potential cleavage site between the glycine (16) and aspartic acid (17) residues 
(Fig. 2). This cleavage site has been confirmed, for the avirulent Purdue strain of 
TGEV, by N-terminal amino acid sequencing of the S protein isolated from virions 
(Rasschaert and Laude, 1987). Assuming that the PRCV signal sequence is cleaved, 
the PRCV S protein would comprise a polypeptide of 1209 amino acids with an Mr of 
132,897, compared to 1433 amino acids with an Mr of 158,160 for the TGEV S protein. 
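
The figures quoted for the S gene can be cross-checked with a few lines of arithmetic. The sketch below is illustrative: the variable and function names are hypothetical, and it assumes that the quoted coordinates 394-4071 include the stop codon, which is inferred here from 3678 nt corresponding to 1226 codons rather than stated in the text.

```python
def orf_length(start: int, end: int) -> int:
    """Length in nucleotides of an ORF given 1-based inclusive coordinates."""
    return end - start + 1


SIGNAL_PEPTIDE_AA = 16   # residues removed by cleavage between Gly16 and Asp17
DELETION_NT = 672        # deletion in the PRCV S gene relative to TGEV

s_orf_nt = orf_length(394, 4071)
s_codons = s_orf_nt // 3                 # assumed to include the stop codon
prcv_s_aa = s_codons - 1                 # amino acids in the PRCV S precursor

deleted_aa = DELETION_NT // 3            # in-frame loss of whole codons
tgev_s_aa = prcv_s_aa + deleted_aa       # implied TGEV S precursor length

prcv_mature_aa = prcv_s_aa - SIGNAL_PEPTIDE_AA
tgev_mature_aa = tgev_s_aa - SIGNAL_PEPTIDE_AA

# Deleted TGEV residues 21-245, renumbered after removal of the signal sequence.
mature_deletion = (21 - SIGNAL_PEPTIDE_AA, 245 - SIGNAL_PEPTIDE_AA)

print(s_orf_nt, prcv_s_aa, deleted_aa, prcv_mature_aa, tgev_mature_aa, mature_deletion)
```

Under these assumptions the derived values agree with the 3678 bp ORF, the 1225 amino acid product, the 224 amino acid deletion and the 1209 and 1433 residue cleaved proteins reported above.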

An incomplete ORF of 396 bp was identified, nucleotides 2-397 (Fig. 2), which 
consisted of 131 amino acids that terminated at a TGA stop codon. The deduced 
amino acid sequence of this ORF had 99.2% similarity with 97.7% identity to an 
ORF found upstream of the TGEV FS772/70 S gene, previously identified as the 
C-terminal end of the TGEV ORF-1b polymerase subunit, because of its homology 
to the ORF-1b polymerase subunits of infectious bronchitis virus (IBV; Boursnell 
et al., 1987) and mouse hepatitis virus (MHV; Bredenbeek et al., 1990). 

A sequence of 57 amino acids, corresponding to nucleotides 4145-4318 (Fig. 2), 
was identified downstream of the PRCV S gene. It had no ACTAAAC site or 
initiation codon but had 71.9% similarity with 70.2% identity to part of the TGEV 
FS772/70 ORF-3a gene, indicating that it formed part of a pseudogene corresponding 
to TGEV ORF-3a. The 174 bp PRCV ORF-3a pseudogene had the first 
three amino acids deleted and contained a 36 base deletion which resulted in the 
loss of the last 13 amino acids when compared to TGEV ORF-3a. However, an in 
frame fusion at nucleotide 4284 (Fig. 2) resulted in 11 extra amino acids at the 
C-terminal end of the PRCV pseudogene product which were identical to the 
amino acids found at the end of ORF-3a from the Miller strain of TGEV (Wesley 
et al., 1989). Work by Page et al. (1991) has shown that PRCV has no mRNA 
species with an ORF equivalent to TGEV ORF-3a at the 5'-end indicating that a 
gene product equivalent to TGEV ORF-3a will not be produced by PRCV though 
part of this potential TGEV gene appears as a pseudogene in the PRCV genome. 

Sequence data of 2790 bp from the PRCV cDNA fragments F-H are shown in 
Fig. 3. Four complete ORFs were identified, nucleotides 59-304, 318-1103, 1119-2264 and 
2273-2506, corresponding to gene products of 82, 262, 382 and 78 amino acids 
(Fig. 3). The nucleotide sequence of the PRCV cDNA showed 98% similarity with 
the equivalent region from TGEV. The amino acid sequences of the four ORFs 
were almost identical to the ORF-4, M, N and ORF-7 gene products of TGEV 
(Fig. 3). The initiation codon of ORF-4 was preceded by the hexameric sequence, 
CTAAAC, and the initiation codons of the M, N and ORF-7 genes were preceded 
by the heptameric sequence, ACTAAAC. 
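
The spacing between a putative leader binding site and the initiation codon that follows it can be measured with a short scan, sketched below. The function name is hypothetical, and the two short excerpts are transcribed from Figs. 2 and 3 (upstream of the S and N genes respectively), so it is an assumption that they carry no transcription errors.

```python
def spacer_to_start_codon(seq: str, motif: str = "ACTAAAC"):
    """Nucleotides between the end of the first motif and the next ATG, or None."""
    seq = seq.upper()
    motif_at = seq.find(motif)
    if motif_at == -1:
        return None
    motif_end = motif_at + len(motif)
    # First ATG at or after the end of the motif, in any reading frame.
    atg_at = seq.find("ATG", motif_end)
    if atg_at == -1:
        return None
    return atg_at - motif_end


# Hypothetical excerpts transcribed from the figures.
S_REGION = "ACTAAACTTTGGTAACCACTTTATTAACACACCATGAAAAAATTATTTGTG"
N_REGION = "GGTATAACTAAACTTCTAAATGGCCAACCAGGGACAACG"

s_spacer = spacer_to_start_codon(S_REGION)
n_spacer = spacer_to_start_codon(N_REGION)
print(s_spacer, n_spacer)
```

Passing `motif="CTAAAC"` would apply the same scan to the hexameric form found upstream of ORF-4. The scan simply takes the first ATG after the motif, which is a simplification that would need checking against the reading frame for other sequences.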

The potential product of the PRCV ORF-4 gene is very similar to the equivalent 
TGEV gene product (Fig. 3) and showed 96% identity to the FS772/70 
(Britton et al., 1989) and Purdue-115 (Rasschaert et al., 1987) but 100% identity to 
the virulent Miller (Wesley et al., 1989) strains of TGEV. The isoleucine at residue 
55 of the Purdue-115 and Miller TGEV sequences, found to be deleted from the 
FS772/70 sequence, was present in the PRCV sequence. There were three amino 
acid substitutions between the PRCV and FS772/70 ORF-4 sequences and four 
amino acid substitutions between the PRCV and Purdue-115 ORF-4 sequences, 
none of which occur between the two TGEV sequences. 

Analysis of the PRCV M protein amino acid sequence by the method of von 
Heijne (1986) identified a potential membrane signal sequence with the cleavage 
site between the Gly16 and Glu17 residues (Fig. 3). A similar M protein membrane 
signal sequence was identified in TGEV (Laude et al., 1987; Britton et al., 
1988b) and in the M protein from the antigenically related coronavirus feline 
infectious peritonitis virus (FIPV; Vennema et al., 1991). 

Comparison of the PRCV N protein and ORF-7 sequences to those of TGEV 
FS772/70 (Britton et al., 1988a) and Purdue-115 (Kapke and Brian, 1986; Rasschaert 
et al., 1987), Fig. 3, showed that there were no deletions or insertions within 
the PRCV genes. No deletions were found within the 3' non-coding region in 
contrast to the two and five base deletions observed in the non-coding region of 
the RM4 isolate of PRCV (Rasschaert et al., 1990). The N protein of PRCV 
contained eight or 14 amino acid substitutions when compared to the FS772/70 
and Purdue-115 sequences respectively (Fig. 3) of which six were identical between 
the TGEV strains. The PRCV sequence contained the octameric sequence, 
GGAAGAGC, at the 3'-end of the genome, upstream of the poly (A) site, 
conserved in all coronavirus sequences to date. 


Discussion 

In this study the 3'-end of the genome from a British isolate of PRCV was 
cloned and sequenced. The region analysed consisted of 7822 nucleotides and 
extended from the 3'-end of the 1b subunit of the PRCV polymerase gene to the 
start of the poly(A) tail. Previous work (Britton et al., 1990; Page et al., 1991) had 
shown that two of the PRCV mRNA species, mRNA 2 and mRNA 3, were smaller 
than the corresponding TGEV species. We concluded that PRCV mRNA 3 was 
smaller due to the creation of a new putative RNA-leader binding site upstream of 
a gene equivalent to the TGEV ORF-3b gene with the loss of the putative RNA 
binding site upstream of TGEV ORF-3a. The new putative RNA-leader binding 
site resulted in a PRCV mRNA 3 species, about 200 nucleotides smaller than the 
equivalent TGEV mRNA, with a gene equivalent to TGEV ORF-3b at the 5'-end. 
The study also identified several small deletions, 84 nucleotides in total, in the 
PRCV genome corresponding to the 5'-end of TGEV mRNA 3. Two of the 
deletions were in the area corresponding to the potential TGEV gene, ORF-3a, 
resulting in the loss of this potential gene in PRCV, although part of the gene 
remained as a pseudogene. This observation was confirmed on a second British 
isolate (Page et al., 1991) and by Rasschaert et al. (1990) for a French isolate of 
PRCV, indicating that it may be a common feature of PRCV. However, these small 
deletions could not account for the smaller size of PRCV mRNA 2. This was due 
to a major deletion of 672 nucleotides, corresponding to 224 amino acids, equivalent 
to amino acid residues 5 to 229 on the TGEV S protein following cleavage of 
the membrane signal (Fig. 4). The deletion was shown to be present on two British 
isolates of PRCV and a French isolate (Rasschaert et al., 1990) indicating that this 
deletion may also be a common feature of PRCV. A cDNA probe, pD24, 
corresponding to amino acids 66 to 154 of the S protein from TGEV (Miller strain) 
reacted with the Miller and Purdue-115 strains of TGEV but not with FIPV, feline 
enteric coronavirus (FECV), canine coronavirus (CCV) or PRCV (strain ISU-1) 
isolated from a pig in Indiana, U.S.A. (Bae et al., 1991). On the other hand, a 
cDNA probe, pE21, corresponding to amino acids 325 to 451 of the Miller S 
protein, reacted with both TGEV strains, PRCV (ISU-1), FIPV, FECV and CCV. 
The region of the Miller S gene corresponding to probe pD24 was within the 
region deleted from PRCV implying that the American isolate of PRCV may also 
contain the S gene deletion. The sequence data presented in this study showed 
that the gene encoding the PRCV ORF-1b polymerase subunit is directly upstream 
of the S gene, in a position similar to that observed for TGEV, with no intervening 
ORFs, in contrast to bovine coronavirus (BCV) and MHV where other genes or 
pseudogenes are present. Therefore the results presented in this paper and from 
previous work indicate that the order of viral genes in PRCV was 5'-[1b-S-3-4-M-N-7]-3' 
whereas TGEV has the gene order 5'-[1b-S-3a-3b-4-M-N-7]-3'. The sequence 
data also showed that PRCV had only diverged by about 4% when 
compared to TGEV sequence data. 
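
The coordinate arithmetic behind the S gene deletion and the probe comparison can be checked with a short Python sketch. The precursor coordinates (residues 21 to 245) are those given in the Summary; the signal peptide length of 16 residues is an assumption inferred from the offset between those coordinates and the mature-protein coordinates (5 to 229). It is also assumed that the Miller probe coordinates can be compared directly with the TGEV residue numbering used for the deletion. All function and variable names are hypothetical.

```python
CODON_LENGTH = 3


def deleted_residues(deleted_nt):
    """Number of amino acids removed by an in-frame nucleotide deletion."""
    if deleted_nt % CODON_LENGTH:
        raise ValueError("deletion is not a whole number of codons")
    return deleted_nt // CODON_LENGTH


def to_mature_numbering(residue, signal_length):
    """Convert precursor numbering to numbering after signal cleavage."""
    return residue - signal_length


def within(inner, outer):
    """True if the closed interval `inner` lies entirely inside `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


SIGNAL_LENGTH = 16                 # assumption, inferred from 21 -> 5
deletion = (21, 245)               # TGEV S precursor numbering (Summary)
probes = {"pD24": (66, 154), "pE21": (325, 451)}

print(deleted_residues(672))
print(tuple(to_mature_numbering(r, SIGNAL_LENGTH) for r in deletion))
for name, span in probes.items():
    print(name, within(span, deletion))
```

Under these assumptions the region covered by pD24 falls entirely inside the deleted region, whereas pE21 lies downstream of it, which is consistent with the hybridisation pattern reported for the ISU-1 isolate.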

Previous comparisons of the S protein sequences from two virulent TGEV 
strains, FS772/70 (Britton and Page, 1990) and Miller (Wesley 1990), and one 
avirulent strain, Purdue-115 (Jacobs et al., 1987; Rasschaert and Laude, 1987), 
identified a six base insert in the genomes of the virulent strains resulting in two 
extra amino acids with an extra potential N-glycosylation site and an amino acid 
substitution at residue 384, serine to phenylalanine, which prevented the binding 
of MAbs belonging to site C (Delmas et al., 1990) or site D (Correa et al., 1988; 
1990; Posthumus, 1990). The British and French PRCV isolates were shown to 
contain both the six nucleotide insert and point mutation. Comparison of S protein 
sequences from the antigenically related coronaviruses, TGEV and FIPV, showed 
94% similarity except for the first 267 amino acids which had only 30% similarity 
(Jacobs et al., 1987). The deletion within the PRCV S protein was in the same 
N-terminal region where the FIPV sequence differed from TGEV (Fig. 4), indicating 
that this region of the S protein may be involved in the different tropisms 
observed for the three viruses. 

Fig. 4. Alignment of the S protein amino acid sequences from TGEV strains FS772/70 (Britton and 
Page, 1990), Miller (Wesley 1990) and Purdue-115 (Rasschaert and Laude, 1987), PRCV 86/137004 
(this paper) and FIPV 79-1146 (De Groot et al., 1987) by the CLUSTAL program. The plusses are 
padding characters for optimal alignment of the sequences. The asterisks below the sequences show 
identical amino acids in all five viruses and the minus is used if the amino acids match from four 
viruses. The thick line above the sequences shows the position of the heptad repeat identified by 
Rasschaert and Laude (1987) for TGEV and the row of arrow heads shows the position of the potential 
transmembrane domain also present in all the virus sequences. 

Fleming et al. (1986) indicated that the MHV S protein encoded the determinants 
required for binding to susceptible cells. Comparison of the S protein amino 
acid sequences from MHV strains, JHM (Schmidt et al., 1987) and A59 (Luytjes et 
al., 1987), identified an 89 amino acid deletion in the JHM sequence. Comparison 
of the S protein sequence from MHV-4 and several neuro-attenuated variants of 
MHV identified a polymorphic region with respect to deletions ranging from 142 
to 159 amino acids (Parker et al., 1989). The S protein from MHV-4 had an insert 
of 141 amino acids when compared to MHV JHM. These observations indicate 
that deletions in the S protein of coronaviruses may be a natural way of altering 
the tropism and concurrently the pathogenicities of the viruses. The exact mechanism 
for the deletion events is not known; there is no evidence for repeat 
sequences or secondary structures allowing the jumping of the polymerase. There 
is evidence that some coronaviruses may undergo recombination events and this 
mechanism cannot be ruled out for the introduction of specific deletions within 
areas of the genomes. 

The potential TGEV ORF-3a gene has been identified in three different strains 
of TGEV, Purdue-115 (Rasschaert et al., 1987), FS772/70 (Britton et al., 1989) 
and Miller (Wesley et al., 1989); however, the C-terminal end was found to differ 
between the strains. Due to the deletions in the PRCV genome a gene equivalent 
to TGEV ORF-3a was present as a pseudogene, but interestingly, the C-terminal 
end of the gene was the same as the Miller strain. An avirulent small plaque 
variant of the Miller strain was shown to have 462 nucleotides deleted eliminating 
ORF-3a (Wesley et al., 1990b). These observations indicate that ORF-3a is not 
needed for propagation of TGEV in vitro and in vivo but whether it plays some 
role in the pathogenicity of TGEV has yet to be elucidated. 

Comparison of the ORF-4 genes showed three substitutions between PRCV 
and TGEV (FS772/70) and four different substitutions for TGEV (Purdue-115). 
The amino acid, residue 55 of Purdue-115, found to be deleted in the FS772/70 
strain of TGEV (Britton et al., 1989), was present in PRCV. No product has been 
assigned to this gene for TGEV. It is interesting that the mRNA4 species, 
encoding this gene, is the lowest abundance mRNA in all TGEV and PRCV 
strains examined to date, which might be due to the loss of the adenosine residue 
at the 5'-end of the putative RNA-leader binding site. 

Previous comparison of the M protein sequences of the TGEV FS772/70 
(Britton et al., 1988b) and two independently published sequences of Purdue-115, 
determined by Kapke et al. (1987) and Laude et al. (1987), revealed 96% similarity 
with 11 and 12 amino acid substitutions respectively. Comparison of the M protein 
sequences between PRCV and these TGEV strains showed 97% homology; however, 
some amino acid substitutions between PRCV and one strain of TGEV were 
not present on the other (Fig. 3), indicating that the degree of divergence of the M 
proteins of PRCV and TGEV was no greater than that observed between two 
strains of TGEV. There were 5 amino acid substitutions (residues 10, 14, 33, 44 
and 198) on the PRCV sequence not found on either TGEV strain. Two of these 
(residues 10 and 14) were within the predicted PRCV signal sequence and two 
(residues 33 and 44) within the N-terminal domain of the molecule, postulated to 
be outside the virion envelope, and the fifth (residue 198) was within the C-terminal 
domain. Although one of the PRCV substitutions was within a predicted 
N-glycosylation site (glycine (33) in PRCV and serine in TGEV) this should not 
affect the addition of N-glycans to the asparagine residue at amino acid 32. 
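
The reasoning can be illustrated with the commonly used N-glycosylation sequon rule, Asn-X-Ser/Thr, in which X may be any residue except proline. A substitution at the X position (serine to glycine, as at residue 33) therefore leaves the sequon intact. The Python sketch below applies this rule to toy peptides; the function name and the peptides are hypothetical and are not the M protein sequences.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons be reported separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein):
    """Return 1-based positions of Asn residues in N-X-S/T sequons."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Toy peptides (hypothetical): the residue after Asn is S, G or P
print(sequon_positions("AANSSA"))  # serine at the X position
print(sequon_positions("AANGSA"))  # glycine at the X position
print(sequon_positions("AANPSA"))  # proline at the X position
```

The rule predicts potential sites only; whether a given asparagine is actually glycosylated cannot be decided from the sequence alone.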

The PRCV ORF-7 gene product contained four or nine amino acid substitutions 
when compared to the FS772/70 and Purdue-115 sequences respectively 
(Fig. 3). These observations together with the comparisons of the number of amino 
acid substitutions between PRCV and the other TGEV genes indicated that 
PRCV was more closely related to the virulent strains of TGEV. The ORF-7 gene 
product has been detected in TGEV (Garwes et al., 1989) and PRCV (Britton et 
al., 1990) infected cells using antisera raised against a synthetic oligopeptide, 
derived from the TGEV ORF-7 sequence, indicating that this gene product has a 
function in viral replication. Both the TGEV and PRCV ORF-7 gene product 
sequences have a high similarity to the penultimate ORF in FIPV (De Groot et al., 
1988), indicating that this gene may be characteristic of the TGEV family of 
coronaviruses. 

The data presented in this paper along with the data presented by Page et al. 
(1990), showing that the putative leader RNA sequence for TGEV and PRCV are 
identical, indicate that the regions of PRCV sequenced so far show good homology 
to TGEV. The observed differences between PRCV and TGEV genomes consisted 
of deletions and point mutations with no sequences unique to PRCV being 
identified. As these differences were no greater than between two different strains 
of TGEV, e.g. Purdue-115 and FS772/70, the results suggest that PRCV is a variant 
of TGEV. 